feat(03-05_load_balancer): Add load-balancer endpoints example #13

Merged
deanq merged 5 commits into main from deanq/ae-1102-load-balancer-sls-resource on Jan 14, 2026

Conversation

deanq (Member) commented Jan 5, 2026

Prerequisite: runpod/flash#131, runpod-workers/flash#45

Summary

Add a comprehensive example demonstrating Flash's load-balancer endpoints with custom HTTP routes. The example shows how to create low-latency APIs by passing method and path parameters to the @remote decorator.
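To illustrate what binding a function to an HTTP method and path means, here is a minimal, dependency-free sketch. It does not use Flash or its @remote decorator: `route`, `ROUTES`, `dispatch`, and the `/compute` path are hypothetical names invented for this illustration, and the real decorator's signature may differ.

```python
from typing import Callable

# Hypothetical registry keyed by (HTTP method, path); not part of Flash.
ROUTES: dict[tuple[str, str], Callable[..., dict]] = {}


def route(method: str, path: str) -> Callable:
    """Toy stand-in for a decorator that takes method and path parameters."""

    def decorator(func: Callable[..., dict]) -> Callable[..., dict]:
        # Register the handler under its normalized method and path.
        ROUTES[(method.upper(), path)] = func
        return func

    return decorator


@route(method="POST", path="/compute")
def compute(numbers: list[float]) -> dict:
    """Example handler: sum the submitted numbers."""
    return {"sum": sum(numbers)}


def dispatch(method: str, path: str, **payload) -> dict:
    """Look up the handler for (method, path) and call it with the payload."""
    handler = ROUTES.get((method.upper(), path))
    if handler is None:
        return {"error": "not found", "status": 404}
    return handler(**payload)
```

Calling `dispatch("POST", "/compute", numbers=[1, 2])` reaches `compute`, while an unregistered method and path pair falls through to the 404 response.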

Key Features

  • GPU service for compute-intensive operations
  • CPU service for data processing
  • Pydantic models for input validation
  • Comprehensive documentation with cost estimates and troubleshooting
  • LiveLoadBalancer for local development
  • LoadBalancerSlsResource configuration for production

Changes Made

  • Created complete load_balancer example in 03_advanced_workers/05_load_balancer/
  • Added pyproject.toml with project metadata
  • Added input validation using Pydantic models
  • Added Cost Estimates section to README
  • Added Troubleshooting section to README
  • Updated root README.md to include new example
  • Updated 03_advanced_workers/README.md to document the example
  • All code passes quality checks (formatting, linting, type checking)

Testing

  • make quality-check passes
  • make consolidate-deps runs successfully
  • Code formatted with ruff
  • All files follow project standards

Documentation

  • Comprehensive README with architecture, examples, and API docs
  • Cost estimates section
  • Troubleshooting section with 5 common issues
  • Repository documentation updated

Create a comprehensive example demonstrating Flash's load-balancer endpoints with custom HTTP routes. This example shows how to create low-latency APIs using the @remote decorator with method and path parameters.

Key additions:
- GPU service for compute-intensive operations
- CPU service for data processing
- Pydantic models for input validation
- Comprehensive documentation with cost estimates and troubleshooting
- LiveLoadBalancer for local development
- LoadBalancerSlsResource configuration for production

Improvements made:
- Added pyproject.toml with project metadata
- Added input validation using Pydantic models
- Added Cost Estimates section to README
- Added Troubleshooting section to README
- Updated root README.md to include new example
- Updated 03_advanced_workers/README.md to document the example

All code passes quality checks: formatting, linting, and type checking.

linear Bot commented Jan 5, 2026

deanq changed the title from "feat(05_load_balancer): Add load-balancer endpoints example" to "feat(03-05_load_balancer): Add load-balancer endpoints example" on Jan 5, 2026
Improvements:
- Remove redundant validation in GPU worker (Pydantic already validates)
- Use UTC timezone for all timestamps instead of local timezone
- Extract VALID_OPERATIONS to module-level constant in CPU worker
- Clarify port documentation in README

Changes:
- GPU endpoint.py: Remove redundant empty list check, use UTC timestamps
- CPU endpoint.py: Add VALID_OPERATIONS constant, use UTC timestamps, simplify operation validation
- README.md: Clarify port differences between standalone (8000) and unified app (8888)

All changes pass quality checks.
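
The two CPU-worker changes described in this commit (a module-level VALID_OPERATIONS constant and UTC timestamps) might look roughly like the sketch below. Only the name VALID_OPERATIONS comes from the commit message; the operation names, `utc_timestamp`, and `process_text` are assumptions made for illustration and are not taken from the actual endpoint.py.

```python
from datetime import datetime, timezone

# Hypothetical operation names; the commit message does not list the real ones.
VALID_OPERATIONS = frozenset({"uppercase", "lowercase", "reverse"})


def utc_timestamp() -> str:
    """Return the current time as a timezone-aware ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


def process_text(text: str, operation: str) -> dict:
    """Apply one of VALID_OPERATIONS to text and stamp the result in UTC."""
    if operation not in VALID_OPERATIONS:
        raise ValueError(
            f"Unknown operation {operation!r}; "
            f"expected one of {sorted(VALID_OPERATIONS)}"
        )
    if operation == "uppercase":
        result = text.upper()
    elif operation == "lowercase":
        result = text.lower()
    else:  # "reverse"
        result = text[::-1]
    return {"result": result, "operation": operation, "timestamp": utc_timestamp()}
```

Keeping the set at module level means the validation check and the error message read from one definition, and a timezone-aware timestamp stays unambiguous regardless of the host's local timezone.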
@deanq deanq requested a review from Copilot January 5, 2026 05:47
Copilot AI left a comment

Pull request overview

This PR adds a comprehensive example demonstrating Flash's load-balancer endpoints with custom HTTP routes using the @remote decorator with method and path parameters.

Key Changes:

  • Created a complete load-balancer example with GPU and CPU services
  • Added comprehensive documentation including cost estimates and troubleshooting sections
  • Updated repository documentation to include the new example

Reviewed changes

Copilot reviewed 14 out of 16 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| README.md | Added load_balancer example to the advanced workers section |
| CLAUDE.md | Updated naming convention comment for resource configuration |
| 03_advanced_workers/README.md | Added detailed documentation for the load_balancer example |
| 03_advanced_workers/05_load_balancer/workers/gpu/endpoint.py | Implemented GPU service endpoints with compute operations |
| 03_advanced_workers/05_load_balancer/workers/gpu/__init__.py | Created FastAPI router for GPU endpoints |
| 03_advanced_workers/05_load_balancer/workers/cpu/endpoint.py | Implemented CPU service endpoints with text processing operations |
| 03_advanced_workers/05_load_balancer/workers/cpu/__init__.py | Created FastAPI router for CPU endpoints |
| 03_advanced_workers/05_load_balancer/requirements.txt | Added tetra_rp dependency |
| 03_advanced_workers/05_load_balancer/pyproject.toml | Added project metadata and dependencies |
| 03_advanced_workers/05_load_balancer/main.py | Created unified FastAPI application |
| 03_advanced_workers/05_load_balancer/README.md | Added comprehensive documentation with examples and troubleshooting |
| 03_advanced_workers/05_load_balancer/.gitignore | Added standard Python and project-specific ignore patterns |
| 03_advanced_workers/05_load_balancer/.flashignore | Added Flash build ignore patterns |
| 03_advanced_workers/05_load_balancer/.env.example | Added environment variable template |

Comment thread 03_advanced_workers/05_load_balancer/workers/gpu/endpoint.py Outdated
Comment thread 03_advanced_workers/05_load_balancer/workers/cpu/endpoint.py
Changes:
- Use ComputeRequest model as parameter in compute_intensive() function for proper type safety and validation
- Update GPU router to pass full request object instead of extracting numbers
- Update test code to create ComputeRequest object properly

This ensures consistent use of Pydantic models for validation throughout the GPU worker endpoint.

All changes pass quality checks.
@deanq deanq requested a review from Copilot January 5, 2026 05:52
Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 16 changed files in this pull request and generated no new comments.


deanq added 2 commits January 5, 2026 21:11
…unction

Remote functions with load-balancer decorators don't have access to custom
types like Pydantic models in their execution context. Move ComputeRequest
to the router module and convert it to a dict before passing to the remote
function, matching the pattern used in the CPU worker.
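
The pattern this commit describes (validate in the router, then hand the remote function only built-in types) can be sketched as follows. To stay runnable without dependencies, the sketch swaps in a standard-library dataclass for the Pydantic model, with `asdict` playing the role that a Pydantic v2 `model_dump()` call would play. The `numbers` field, the `handle_compute` handler, and the returned keys are assumptions; only the names ComputeRequest and compute_intensive appear in the PR.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ComputeRequest:
    """Dataclass stand-in for the Pydantic model; defined in the router module only."""

    numbers: list[float] = field(default_factory=list)  # assumed field name

    def __post_init__(self) -> None:
        # Minimal validation standing in for what Pydantic would enforce.
        if not self.numbers:
            raise ValueError("numbers must not be empty")


def compute_intensive(payload: dict) -> dict:
    """Stand-in for the remote function: it receives only built-in types."""
    numbers = payload["numbers"]
    return {"sum": sum(numbers), "count": len(numbers)}


def handle_compute(request: ComputeRequest) -> dict:
    """Router-side handler: the request is already validated, so convert and forward."""
    return compute_intensive(asdict(request))
```

Because `compute_intensive` never references `ComputeRequest`, it can run in an execution context where that class is not importable, which is the constraint the commit message points to.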
@deanq deanq merged commit 62d44a7 into main Jan 14, 2026
6 checks passed
@deanq deanq deleted the deanq/ae-1102-load-balancer-sls-resource branch January 14, 2026 18:58

3 participants